Detecting Hot Spots Using Cluster Analysis and GIS
Abstract
One of the more popular approaches for detecting crime hot spots is cluster analysis. Implemented in a wide variety of software packages, including CrimeStat, SPSS, SAS, and S-PLUS, cluster analysis can be an effective method for identifying areas that exhibit elevated concentrations of crime. However, detecting hot spots with clustering techniques remains particularly challenging, both because the appropriate number of clusters to generate is uncertain and because it is difficult to establish the significance of the individual clusters identified. This paper highlights the potential utility of several diagnostics for resolving such issues.

1. Introduction

Crime mapping and analysis have evolved significantly over the past 30 years. In the beginning, many agencies used city and precinct maps with colored pins to visualize individual crime events and crime-plagued areas. Today, with the rapid advancement of technology, computer-based techniques for exploring, visualizing, and explaining the occurrence of criminal activity have become essential. One of the more influential tools facilitating exploration of the spatial distribution of crime has been GIS (Ratcliffe and McCullagh, 1999; Harries, 1999). As Murray et al. (2001) note, it is the ability to combine spatial information with other data that makes GIS so valuable. Furthermore, the sheer quantity of information available to most analysts necessitates an intelligent computational system, able to integrate a wide variety of data and facilitate the identification of patterns with minimal effort.

Fundamental to the explanation of criminal activities in a spatial context are certain environmental factors, such as the physical layout of an area, proximity to various services, and land use mixes, all of which are likely to influence criminal behavior (Greenburg and Rohe, 1984).
Issues of access, exposure, opportunity, and the availability of targets are also important elements in helping explain crime from an environmental perspective (Cohen and Felson, 1979; Brantingham and Brantingham, 1981). Not surprisingly, research indicates that certain areas are prone to higher concentrations of crime. Widely labeled as "hot spots", such areas are often targeted with increased manpower by law enforcement agencies in an effort to reduce crime. Where resources are concerned, the identification of hot spots is helpful because most police departments are understaffed. As such, the ability to prioritize intervention through a geographic lens is appealing (Levine, 1999a).

Operationally, the delineation of hot spot boundaries is somewhat arbitrary. As Levine (1999a) notes, crime density is measured over a continuous area. Therefore, the boundaries separating hot spots of crime from areas without enough activity to merit the label are perceptual constructs. Moreover, depending on the scale of geographic analysis, a hot spot can mean very different things (Harries, 1999).

Recent studies by the Crime Mapping Research Center (CMRC) at the National Institute of Justice categorize hot spot detection and analysis methods. These techniques have been classified as follows (Jefferis, 1999; Harries, 1999): visual interpretation, choropleth mapping, grid cell analysis, spatial autocorrelation, and cluster analysis. Further, twelve different variations on the five classes of hot spot identification techniques were systematically documented and evaluated, yielding several important results. Although there are a variety of methods for detecting hot spots in crime event data, no single approach was found to be superior to the others. What does become clear in previous work on hot spot detection is that combining cartographic visualization of crime events with statistical tools provides valuable insight for detecting areas of concern.
Results of the CMRC (1998) study suggest that tests of spatial autocorrelation are a good approach for detecting hot spots. Implemented in a variety of packages, including CrimeStat 1.1, SpaceStat, S-PLUS Spatial Statistics, and SAGE, both global and local tests of spatial autocorrelation assist in crime analysis. As demonstrated by Szakas (1998), the implementation of the Getis-Ord statistic (Gi statistic) in SpaceStat provided very good measures of crime hot spots for Baltimore County. The utility of spatial autocorrelation and the Gi statistic for hot spot analysis is further supported in the work of Craglia et al. (2000).

Considering the success of statistically grounded tests for hot spot detection, such as the Gi statistic for spatial autocorrelation, it is unfortunate that other well-established statistical methods, such as cluster analysis, are generally viewed as less useful (Chainey and Cameron, 2000). Gordon (1999) suggests that cluster analysis is one of the most useful methods for exploratory data analysis, especially in large multivariate data sets. If this is the case, why has it failed to help crime analysts in hot spot detection? Statistical approaches for cluster analysis are widely available in a number of software packages, including CrimeStat, SAS, SPSS, Systat, and S-PLUS. However, the evaluation, documentation, and implementation of cluster analysis algorithms, particularly the nonhierarchical versions commonly used in crime analysis such as k-means, are unclear and offer little direction for useful application (Murray and Estivill-Castro, 1998; Murray and Grubesic, 2002).

The gap between what has been developed and what is actually needed for hot spot detection exists for several reasons. First, crime hot spots are spatial phenomena. Therefore, in order to identify elevated concentrations of crime in a geographic area, tools that treat space appropriately are critical.
Second, existing approaches for cluster analysis are not necessarily ideal when applied to spatially referenced data. This is best reflected by the relatively poor performance of the k-means algorithm, as implemented in leading statistical packages, for spatial data analysis (Murray and Grubesic, 2002). Given that non-hierarchical clustering approaches have proven fruitful in other research areas, it is premature to deem such techniques too complex or too poorly performing for crime analysis, as Chainey and Cameron (2000) do. The failure of non-hierarchical techniques to date is a product of how they are being used and supported.

The purpose of this paper is to explore two of the problematic aspects of cluster analysis for hot spot detection. First, we examine the difficulties in determining the appropriate number of clusters, p, to generate. Second, we highlight several statistical methods that have the potential for establishing the significance of clusters identified as hot spots.

The remainder of this paper is organized as follows. Section 2 outlines the differences between hierarchical and partitioning techniques for cluster analysis. Section 3 examines the problems associated with identifying the appropriate number of clusters, including a discussion of several approaches that have the potential to make this choice more statistically grounded. Section 4 explores the issues of attaching significance to the clusters generated in hot spot detection. Section 5 contains a brief discussion and closing remarks.

2. Clustering Approaches
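
To make the first problem raised in the introduction concrete, the following Python sketch runs a basic k-means partition (Lloyd's algorithm) for several values of p and reports the within-cluster sum of squares for each. It is only an illustration under stated assumptions: the event coordinates are synthetic, the function names `kmeans` and `synthetic_events` are hypothetical, and the within-cluster sum of squares is used as a generic diagnostic rather than as the specific measure examined in this paper.

```python
import random


def kmeans(points, p, iters=50, seed=0):
    """Basic Lloyd's algorithm on 2-D points; returns (centers, labels, wcss)."""
    rng = random.Random(seed)
    centers = rng.sample(points, p)  # random starting centers (an assumption)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(p),
                key=lambda j: (x - centers[j][0]) ** 2 + (y - centers[j][1]) ** 2)
            for x, y in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(p):
            members = [pt for pt, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(centers[j])  # leave an empty cluster's center in place
        if new_labels == labels and new_centers == centers:
            break  # nothing changed, so the partition has converged
        labels, centers = new_labels, new_centers
    # Within-cluster sum of squared distances to the assigned centers.
    wcss = sum((x - centers[lab][0]) ** 2 + (y - centers[lab][1]) ** 2
               for (x, y), lab in zip(points, labels))
    return centers, labels, wcss


def synthetic_events(seed=1):
    """Hypothetical crime-event coordinates scattered around three made-up hubs."""
    rng = random.Random(seed)
    hubs = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in hubs for _ in range(40)]


events = synthetic_events()
wcss_by_p = {}
for p in range(1, 7):
    # Keep the best of several random starts, since k-means finds only local optima.
    wcss_by_p[p] = min(kmeans(events, p, seed=s)[2] for s in range(10))

for p, wcss in wcss_by_p.items():
    print(p, round(wcss, 1))
```

Because adding clusters tends to lower the within-cluster sum of squares regardless of whether the extra clusters are meaningful, this value alone cannot select p. An analyst must instead look for the point at which further increases in p yield little improvement, which is one reason the choice of p calls for more formal diagnostics.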